A method for tuning parameters in Monte Carlo generators is described and applied to a specific 
case. The method works in the following way: each observable is generated several times using 
different values of the parameters to be tuned. The output is then approximated by some analytic 
form to describe the dependence of the observables on the parameters. This approximation is used to 
find the values of the parameters that give the best description of the experimental data. This results 
in significantly faster fitting compared to an approach in which the generator is called iteratively. 
As an application, we employ this method to fit the parameters of the unintegrated gluon density 
used in the Cascade Monte Carlo generator, using inclusive deep inelastic data measured by the 
H1 Collaboration. We discuss the results of the fit, its limitations, and its strong points. 

I. INTRODUCTION 

The substructure of the proton is parametrized in terms of parton distribution functions (PDFs). In perturbative 
QCD the PDFs are given by solutions of integral equations, for which the initial input distributions have to be 
determined by global fits to the available experimental data (see, e.g., [1] and references therein). All present global 
fits are based on fixed-order calculations in $\alpha_s$, the strong coupling constant, and on factorization theorems that apply 
to specific inclusive processes, where most of the final-state properties are integrated over. 

To study more exclusive processes (i.e., multiparticle production or multidifferential cross sections), Monte Carlo 
(MC) event generators are used. The physics included in the generators is often not the same as that described by 
the factorization theorems and used in global fits. For instance, most of the generators do not implement complete 
next-to-leading-order (NLO) QCD corrections, but on the other hand they implement parton showers, which partially 
take into account all-orders resummation effects. 

Because of these differences, using PDFs extracted from global fits in MC generators is in principle not fully 
consistent. Ideally, the PDFs should be fitted directly using a MC event generator Q, together with all other 
parameters of the generator. Unfortunately, the parameters of a generator are difficult to tune efficiently because 
minimization programs require several sequential calls of the generator. This can be extremely time-consuming, 
especially for more exclusive events. 

Motivated by Refs. Q, we are using a fast and efficient method to fit generator parameters. The method is based 
on using a MC event generator to produce a grid in parameter space for each observable. The parameter dependence 
is then approximated by polynomials before the fit is performed, which significantly reduces the fitting time. This 
method has also been recently used in Ref. Q. 

As an application, we tune the parameters of the unintegrated gluon distribution function (also called 
transverse-momentum-dependent gluon distribution function) using the Cascade MC event generator [5], by fitting the generator 
predictions to inclusive deep inelastic scattering data measured by the H1 Collaboration [6]. We explore the reliability 
and the limitations of the method and study to what extent the data can constrain the input parameters. 

The paper is organized as follows. In Section II we give the details of the fitting method, starting from a simple example 
which we then generalize to several parameters and observables. In Section III we discuss how the fitting method is applied 
to a specific case, and the results of the tune are presented in Section IV. We draw some conclusions at the end of the 
paper. 

II. TUNING METHOD 

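In general, the goal of the tuning is to describe a set of $N$ experimental observables $Y_i^{\mathrm{ex}}$, with errors $\delta Y_i^{\mathrm{ex}}$, by means 
of a theoretical model (in this case a MC generator) that depends on the parameters $a_\alpha$, and predicts the observables 
to be $Y_i^{\mathrm{MC}}(a_\alpha)$, with errors $\delta Y_i^{\mathrm{MC}}(a_\alpha)$. The values of the parameters that give the best description of the data can 
be found by minimizing the $\chi^2$ function 

\[
\chi^2(a_\alpha) = \sum_{i=1}^{N} \frac{\left[Y_i^{\mathrm{MC}}(a_\alpha) - Y_i^{\mathrm{ex}}\right]^2}{\left(\delta Y_i^{\mathrm{MC}}(a_\alpha)\right)^2 + \left(\delta Y_i^{\mathrm{ex}}\right)^2}. \tag{1}
\]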

Usually, the minimization is done by numerical programs such as MINUIT [7]. The generator predictions typically have to be 
computed a few hundred times for different choices of the parameters before the minimum is found. This 
"brute-force" procedure is very time-consuming. 

An alternative approach has been used in, e.g., Ref. [|| as early as twenty years ago, and more recently in Refs. HE!- 
First, for each observable a grid in parameter space is built, running the MC generator with several values of the 
parameters. Secondly, the grids are approximated by analytic functions of the parameters, usually polynomials. These 
functions give a fair description of the generator output and can be used in its stead. In this way, finding the parameter 
values that best fit the data becomes a much faster task. 

The method turns out to be particularly time efficient. A fitting procedure typically requires calculating $\chi^2$ 
sequentially a few hundred times for different values of the parameters. Building the grids in parameter space also requires 
running the MC generator a few hundred times, but each computation can be done independently and in parallel. Once 
the grid is built and approximated analytically, minimizing the $\chi^2$ is extremely fast. It becomes very convenient to 
run the minimization with different initial values of the parameters, or including only a subsample of the observables. 
However, if new data points are added, a new grid has to be produced for each new data point. 

A. A simple example 

To illustrate the method, we start from a simple example. Suppose we need to fit two data points $Y_1^{\mathrm{ex}}$ and $Y_2^{\mathrm{ex}}$ with 
their errors (e.g., two cross-section measurements) using a MC generator with one tunable parameter $a$. In Fig. 1 
we indicate the two data points with solid horizontal lines with their error bands. 

First, we choose 5 values ($j = 1, \ldots, 5$) of the parameter $a$ and generate 5 predictions for each observable, i.e., two 
grids $(a_{1,j}, Y_1^{\mathrm{MC}}(a_{1,j}))$ and $(a_{1,j}, Y_2^{\mathrm{MC}}(a_{1,j}))$, with statistical errors due to the Monte Carlo method. In Fig. 1 these 
grids are indicated by points (the errors are too small to be visible). 

Then we choose an analytical form to approximate the two grids, which will be a function of $a$, but also of two new 
sets of parameters $A_1, B_1, \ldots$ and $A_2, B_2, \ldots$, one for each grid. To avoid confusion, we denote these new parameters 
as "grid parameters," to be distinguished from the original MC parameters. In principle, the functional form itself 
could be different for each distinct grid, but in practice it is more convenient to choose the same form. To make the 
procedure easier, it is a good idea to choose a function that is linear in the grid parameters, for instance a third-degree 
polynomial 

\[
Y_i^{\mathrm{app}}(a; A_i, B_i, C_i, D_i) = A_i + B_i\, a + C_i\, a^2 + D_i\, a^3. \tag{2}
\]

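The best values of the grid parameters are chosen by means of a $\chi^2$ minimization for each separate grid. We define 
this procedure as "grid approximation," to be distinguished from the actual fit to the experimental data. We define 
in this case 

\[
\chi_i^2(A_i, B_i, \ldots) = \sum_{j=1}^{5} \frac{\left[Y_i^{\mathrm{MC}}(a_{1,j}) - Y_i^{\mathrm{app}}(a_{1,j}; A_i, B_i, \ldots)\right]^2}{\left(\delta Y_i^{\mathrm{MC}}(a_{1,j})\right)^2}. \tag{3}
\]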

The polynomials obtained using the best-fit parameter values, $\bar A_i, \bar B_i, \ldots$, are indicated as curved solid lines in Fig. 1. 
The $\chi_i^2$ analysis also allows us to estimate the error bands on the grid approximations (not visible in the figure). 

At this point, it is useful to remark that the degree of the polynomial introduced in Eq. (2) is a matter of choice. 
Usually, the higher the degree, the better the description of the grids becomes. However, from a certain point on, 
adding an extra degree does not improve the quality of the approximation significantly, i.e., it does not change 
significantly the sum of the minimum $\chi_i^2$. In the case shown in Fig. 1 it turns out that a third-degree polynomial 
gives a much better description of the grid than a second-degree polynomial, while the fourth-degree polynomial does 
not significantly improve the situation. 

FIG. 1: Example of the fit procedure applied to a single parameter and two observables: the horizontal lines with bands 
represent the experimental values and errors of the observables, the points indicate the grids predicted by the MC generator 
for different values of the parameter on the x axis, the curved lines represent analytical approximations to the grids, and the 
vertical lines indicate the best-fit value of the parameter. 

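Once we have analytical approximations of the Monte Carlo generated grids, we can finally fix the best value of the 
parameter $a$ by minimizing the function 

\[
\chi^2(a) = \sum_{i=1}^{2} \frac{\left[Y_i^{\mathrm{app}}(a; \bar A_i, \bar B_i, \ldots) - Y_i^{\mathrm{ex}}\right]^2}{\left(\delta Y_i^{\mathrm{app}}\right)^2 + \left(\delta Y_i^{\mathrm{ex}}\right)^2}. \tag{4}
\]

In Fig. 1 the best-fit parameter value, $\bar a$, is indicated as a straight vertical line. 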
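
The three steps of this simple example can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: `toy_generator` is a hypothetical, noise-free stand-in for a real MC run, the pseudo-data and their errors are invented, and the final minimization is a plain scan rather than MINUIT.

```python
# Toy version of the procedure: one parameter, two observables.
# toy_generator is a hypothetical stand-in for a real MC generator run.

def toy_generator(a):
    """Return two 'observables' for parameter value a (noise-free toy)."""
    return [1.0 + 2.0 * a - 0.5 * a**3, 3.0 - a + 0.8 * a**2]

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_cubic(xs, ys):
    """Least-squares cubic A + B a + C a^2 + D a^3 via the normal equations."""
    basis = [[x**k for k in range(4)] for x in xs]
    ata = [[sum(b[r] * b[c] for b in basis) for c in range(4)] for r in range(4)]
    aty = [sum(b[r] * y for b, y in zip(basis, ys)) for r in range(4)]
    return solve(ata, aty)

def poly(coeffs, a):
    """Evaluate the polynomial approximation at parameter value a."""
    return sum(c * a**k for k, c in enumerate(coeffs))

# Step 1: grid of 5 parameter values, one grid per observable.
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
runs = [toy_generator(a) for a in grid]

# Step 2: grid approximation, one cubic per observable.
coeffs = [fit_cubic(grid, [run[i] for run in runs]) for i in range(2)]

# Pseudo-data produced at a known parameter value, with assumed errors.
a_true = 1.2
data = toy_generator(a_true)
errors = [0.05, 0.05]

def chi2(a):
    """Chi-square between the polynomial approximations and the pseudo-data."""
    return sum(((poly(c, a) - d) / e) ** 2 for c, d, e in zip(coeffs, data, errors))

# Step 3: minimize chi2 using only the polynomials (a fine scan suffices here).
scan = [i * 0.001 for i in range(2001)]
a_best = min(scan, key=chi2)
print(round(a_best, 3))
```

The generator is called only in step 1; the scan in step 3 evaluates polynomials alone, which is why repeating the fit is cheap.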



B. The general case 

Generalizing the above example, with $N$ experimental points (denoted by the index $i$) and $P$ parameters (denoted 
by the index $\alpha$), we need to build $N$ grids in $(P+1)$-dimensional spaces, $(a_{\alpha,j_\alpha}, Y_i^{\mathrm{MC}}(a_{\alpha,j_\alpha}))$. If we choose $J_\alpha$ points 
for each parameter, the generation of the grid requires $J = \prod_{\alpha=1}^{P} J_\alpha$ Monte Carlo runs. Once the grids are built, we 
approximate them using polynomials of degree $n$ (for simplicity we show here explicitly only the terms up to second 
degree) 

\[
Y_i^{\mathrm{app}}(a_\alpha; A_i, B_{i,\alpha}, C_{i,\alpha\beta}, \ldots) = A_i + \sum_{\alpha=1}^{P} B_{i,\alpha}\, a_\alpha + \sum_{\beta=1}^{P} \sum_{\alpha=\beta}^{P} C_{i,\alpha\beta}\, a_\alpha a_\beta + \ldots . \tag{5}
\]

Note that the Monte Carlo parameters $a_\alpha$ are the variables of the polynomials, while the grid parameters are the 
coefficients. For degree two and higher, the off-diagonal terms like $C_{i,\alpha\beta}$, $\alpha \neq \beta$, take into account correlations between 
the Monte Carlo parameters. In our application, we found that third-degree polynomials give a good description of 
the grid. Advancing to fourth-degree polynomials does not lead to significant improvements. 

The total number of coefficients for a degree-$n$ polynomial of $P$ parameters is $M = \sum_{k=0}^{n} \frac{(P+k-1)!}{k!\,(P-1)!}$. For instance, 
a polynomial of third degree of four Monte Carlo parameters has 35 coefficients. For simplicity, we denote them 
collectively as $A_{i,s}$, where $A_{i,1} = A_i$, $A_{i,s} = B_{i,\alpha}$ for $s = 2, \ldots, P+1$, $A_{i,s} = C_{i,\alpha\beta}$ for $s = P+2, \ldots, P+2+P(P+1)/2$, 
etc. 
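
As a quick check of this counting, the sum over monomials of each degree can be evaluated directly. The sketch below is illustrative only; the function name `n_coefficients` is an assumption introduced here.

```python
from math import comb, factorial

def n_coefficients(n_params, degree):
    """Number of coefficients of a full polynomial of the given degree in n_params variables."""
    return sum(
        factorial(n_params + k - 1) // (factorial(k) * factorial(n_params - 1))
        for k in range(degree + 1)
    )

# The sum collapses to a single binomial coefficient, C(P + n, n).
assert n_coefficients(4, 3) == comb(4 + 3, 3)

print(n_coefficients(4, 3))  # third degree, four parameters: 35
```

The count grows quickly with both the degree and the number of parameters, which bounds from below the number of grid points needed for the approximation to be over-determined.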
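The values of the coefficients that give the best approximation to the grid are obtained by minimizing 

\[
\chi_i^2(A_{i,s}) = \sum_{j=1}^{J} \frac{\left[Y_i^{\mathrm{MC}}(a_{\alpha,j}) - Y_i^{\mathrm{app}}(a_{\alpha,j}; A_{i,s})\right]^2}{\left(\delta Y_i^{\mathrm{MC}}(a_{\alpha,j})\right)^2}, \tag{6}
\]

where $j$ runs over the $J$ grid points. 
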
Since the fit function is linear in the coefficients, the best way to perform the $\chi^2$ minimization is to use Singular 
Value Decomposition (SVD) [9]. SVD is based on the fact that the relation between the observables and the grid 
parameters $A_{i,s}$ can be written as an over-determined system of linear equations. SVD provides a solution to this 
system in a least-squares sense. Compared to other, more general numerical minimization procedures (such as the 
ones implemented in MINUIT), SVD is faster and guarantees that the true $\chi^2$ minimum is found. The solution does 
not depend on the choice of the initial values of the parameters. This is particularly important when the minimization 
involves several dozens of parameters. 

The approximation procedure returns the best-fit values, $\bar A_{i,s}$, of the coefficients and a covariance matrix that can 
be used to estimate the statistical error bands on the approximation, $\delta Y_i^{\mathrm{app}}(a_\alpha; \bar A_{i,s})$, by means of error propagation. 
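
A minimal sketch of such a linear least-squares grid approximation, assuming NumPy is available, is given below. The two-parameter, second-degree setup, the grid values, and the toy "generator output" are assumptions chosen for illustration; `numpy.linalg.lstsq` relies on an SVD-based solver, and the covariance matrix is taken here as the pseudo-inverse of the weighted normal matrix.

```python
import numpy as np

def design_matrix(points):
    """Rows of monomials [1, a1, a2, a1^2, a1*a2, a2^2] (2 parameters, degree 2)."""
    a1, a2 = points[:, 0], points[:, 1]
    return np.column_stack([np.ones_like(a1), a1, a2, a1**2, a1 * a2, a2**2])

# Hypothetical grid: 4 x 4 points in a two-dimensional parameter space.
a1_vals = np.linspace(0.0, 1.0, 4)
a2_vals = np.linspace(-1.0, 1.0, 4)
points = np.array([(u, v) for u in a1_vals for v in a2_vals])

# Toy "generator output" with a known quadratic dependence and constant errors.
true_coeffs = np.array([1.0, 0.5, -0.3, 0.2, 0.1, -0.4])
y = design_matrix(points) @ true_coeffs
sigma = np.full(len(y), 0.01)

# Weighted least squares: divide each equation by its error, then solve the
# over-determined linear system in the least-squares sense.
X = design_matrix(points) / sigma[:, None]
coeffs, _, rank, singular_values = np.linalg.lstsq(X, y / sigma, rcond=None)

# Covariance of the coefficients, (X^T X)^-1, via the pseudo-inverse.
cov = np.linalg.pinv(X.T @ X)

# Error on the approximation at a new parameter point, by error propagation.
g = design_matrix(np.array([[0.3, 0.2]]))[0]
delta_y = np.sqrt(g @ cov @ g)
print(coeffs.round(6), rank, delta_y)
```

No starting values enter anywhere, which reflects the remark above that the solution does not depend on an initial guess.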

Once the grids are approximated by polynomials in Monte Carlo parameter space, we finally want to choose the 
values of the parameters $a_\alpha$ that give the best description of the data. To correctly take into account systematic 
errors in the experimental measurements, the $\chi^2$ function has been computed using [10] 

\[
\chi^2 = \sum_{i=1}^{N} \frac{\left[Y_i^{\mathrm{app}}(a_\alpha; \bar A_{i,s}) - Y_i^{\mathrm{ex}} + \sum_{k=1}^{n_{\mathrm{sys}}} r'_k\, \beta_{ik}\right]^2}{\left(\delta Y_i^{\mathrm{app}}\right)^2 + \left(\delta Y_i^{\mathrm{ex}}\right)^2} + \sum_{k=1}^{n_{\mathrm{sys}}} r'^{\,2}_k, \tag{7}
\]

where the random parameters $r'_k$ are defined in App. A. The minimization is done in this case using MINUIT, since 
the dependence on the parameters $a_\alpha$ is non-linear. 

The tuning method studied in Ref. Q is essentially the same as the one considered here. The main differences 
between the two implementations reside in the definition of the $\chi^2$ function, which in our case includes the statistical 
error in the grid approximation ($\delta Y_i^{\mathrm{app}}$) and the contribution of correlated systematic uncertainties. 



III. AN APPLICATION: FITTING THE UNINTEGRATED GLUON DISTRIBUTION FUNCTION IN CASCADE 

The fitting method described above is general and may be applied to tune any parameter in any Monte Carlo 
generator. At present, however, we want to concentrate on tuning the parameters of the unintegrated gluon distribution 
function (uGDF), also known as transverse-momentum-dependent gluon distribution function, in the Cascade 
MC generator. 

A brief introduction to the Cascade event generator is in order. For a more detailed description we refer the 
reader to Ref. [5]. Cascade is a hadron-level Monte Carlo event generator for $ep$, $\gamma p$ and $pp$ processes, which uses the 
CCFM evolution equation for the initial-state parton shower, supplemented with off-shell matrix elements for the hard 
scattering. To simulate the hadronization process, Cascade uses the Lund string model [11]. 

The CCFM equation is a linear integral equation which sums up the cascade of gluons under the condition that 
subsequent emissions are angularly ordered. With this ordering it interpolates between the DGLAP (resummation of 
transverse momenta, $\alpha_s^n \ln^n k_t^2$) and BFKL (resummation of longitudinal momenta, $\alpha_s^n \ln^n x$) limits. 

In Fig. 2 we show schematically a parton ladder defining the kinematic variables which we use in the equations below. 
The CCFM equation reads: 

\[
\mathcal{A}(x,k_t,q) = \mathcal{A}_0(x,k_t,q) + \int_x^1 \frac{dz}{z} \int \frac{d^2\bar q}{\pi \bar q^2}\, \Theta(q - z\bar q)\, \Delta_s(q, z\bar q)\, P_{gg}(z,q,k_t)\, \mathcal{A}\!\left(\frac{x}{z}, k_t', \bar q\right) \tag{8}
\]

where $\mathcal{A}_0(x,k_t,q)$ is the input distribution, $x$ denotes the longitudinal momentum fraction of the proton carried by 
the gluon, $k_t$ is the 2-dimensional transverse momentum of the $t$-channel gluon, $z = x/x'$ is the splitting variable, 
$q$ is the factorization scale specified by the maximum allowed angle $\Xi$ between the partons in the matrix elements, and 
$k_t' = |k_t + (1-z)\bar q|$. We also introduced $\bar q$ as a shorthand notation for the 2-dimensional momentum $\bar q = \bar q_t \equiv p_g/(1-z)$. 
The Sudakov form factor $\Delta_s(q, z\bar q)$ (which we do not write explicitly) for inclusive quantities regularizes the $1/(1-z)$ 
collinear singularity of the splitting function $P_{gg}(z,q,k_t)$. 
The input distribution can be written as 

\[
\mathcal{A}_0(x,k_t,q) = \mathcal{A}_0(x,k_t)\, \Delta_s(q,Q_0). \tag{9}
\]

FIG. 2: Schematic view of a parton ladder illustrating the kinematic variables used in the text. 

We choose to parametrize the distribution at the starting scale $Q_0 = 1.2$ GeV in the following way 

\[
x\mathcal{A}_0(x,k_t) = N\, x^{-B}\, (1-x)^{C}\, (1 - D x)\, e^{-(k_t-\mu)^2/\sigma^2} \tag{10}
\]

where $N$, $B$, $C$, $D$, $\mu$, $\sigma$ should in principle be determined from fits. In practice, for the purpose of the present study 
we fix $C = 4$, $\mu = 0$ GeV, $\sigma = 1$ GeV Q. The value of parameter $C$ is dictated by the spectator counting rules [12]: 
since at large $x$ gluons are suppressed as compared to quarks, $C$ for gluons has to be larger than 3. Previous studies 
suggest $C = 4$. Parameter $D$, typically included in global fits of the PDFs (see, e.g., [H], El), was set to zero in 
earlier studies with Cascade [I3,[lall3|. As we will show later, the addition of this parameter substantially improves 
the description of the data we consider. 
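
For orientation, the starting distribution of Eq. (10) can be written as a small Python function. This is a sketch only: the function name and argument names are assumptions, the defaults follow the fixed values quoted above (with $\mu = 0$ GeV assumed), and the parameter values in the usage lines are illustrative, not fit results.

```python
import math

def x_gluon_start(x, kt, N, B, D=0.0, C=4.0, mu=0.0, sigma=1.0):
    """x * A_0(x, kt) as in Eq. (10); kt, mu and sigma are in GeV."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie in (0, 1)")
    gauss = math.exp(-((kt - mu) ** 2) / sigma**2)
    return N * x ** (-B) * (1.0 - x) ** C * (1.0 - D * x) * gauss

# Illustrative parameter values only (not fit results).
print(x_gluon_start(x=1e-3, kt=1.0, N=0.5, B=0.1))
print(x_gluon_start(x=1e-3, kt=1.0, N=0.5, B=0.1, D=-5.0))
```

At small $x$ the factor $(1 - Dx)$ is close to one, so $D$ mainly changes the shape at moderate and large $x$, while $B$ controls the small-$x$ rise.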

The parameters of the starting uGDF, $N$, $B$, and $D$ in Eq. (10), are determined by fits to the $F_2$ structure function in 
inclusive deep inelastic scattering, $ep \to e'X$, as measured by the H1 Collaboration [6]. We chose this data set in order 
to compare our results with earlier determinations of the uGDF. The measurement was made at the electron-proton 
center-of-mass energy $\sqrt{s} = 300.9$ GeV within the kinematic range $1.5 < Q^2 < 150$ GeV$^2$, $3 \times 10^{-5} < x_{Bj} < 0.2$. 
Here $Q^2$ is the virtuality of the exchanged boson, and $x_{Bj}$ is the Bjorken scaling variable. The measurements cover 
the small-$x_{Bj}$ region where gluon-induced processes dominate and we should have a good sensitivity to the values of 
the parameters in the uGDF. In total, there are 122 data points binned in $x_{Bj}$ and $Q^2$. 

We considered two different cases: in the first case we restricted ourselves to $x_{Bj} < 0.005$ and $Q^2 > 4.5$ GeV$^2$, as 
in most of the available Cascade tunes [ll, [li|; in the second case, we extended the range to the whole data set. 

In summary, we performed four kinds of fits: 

Fit 1. $x_{Bj} < 0.005$ and $Q^2 > 4.5$ GeV$^2$, $D = 0$ in Eq. (10). 

Fit 2. $x_{Bj} < 0.005$ and $Q^2 > 4.5$ GeV$^2$, $D \neq 0$ in Eq. (10). 

Fit 3. full $x_{Bj}$ and $Q^2$ range, $D = 0$ in Eq. (10). 

Fit 4. full $x_{Bj}$ and $Q^2$ range, $D \neq 0$ in Eq. (10). 

The grid in parameter space was built in two different ways depending on whether parameter $D$ was set to zero or 
treated as a fit parameter. The final grids were chosen after performing rough fits with wider grids. In Fits 1 and 3, we 
chose N = [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9] and B = [-0.05, -0.025, 0.0, 0.025, 0.05, 0.075, 0.1, 0.125] for a 
total of 72 grid points. In Fits 2 and 4, we chose N = [0.30, 0.38, 0.46, 0.54, 0.62, 0.70], B = [0.00, 0.05, 0.10, 0.15, 0.20], 
and D = [-12, -10, -8, -6, -4, -2, 0], for a total of 150 grid points. For each point 2.5 million events were generated. 
Each generation takes a few hours of computing time and can be run in parallel. 
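
As an illustration of how such a grid can be enumerated, the short Python sketch below builds the list of generator runs for the Fit 1 and Fit 3 setup from the values quoted above. The job-naming helper is purely hypothetical and is not part of Cascade or of the authors' tooling.

```python
from itertools import product

# Grid values quoted in the text for Fit 1 and Fit 3 (D fixed to zero).
N_values = [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9]
B_values = [-0.05, -0.025, 0.0, 0.025, 0.05, 0.075, 0.1, 0.125]

# Every combination of N and B corresponds to one independent generator run.
grid_points = list(product(N_values, B_values))
print(len(grid_points))  # 9 * 8 = 72

def job_name(index, point):
    """Hypothetical label for one generator job; the naming scheme is an assumption."""
    n_val, b_val = point
    return f"run{index:03d}_N{n_val:.2f}_B{b_val:+.3f}"

jobs = [job_name(i, p) for i, p in enumerate(grid_points)]
print(jobs[0], jobs[-1])
```

Because the runs share no state, the resulting job list can be handed to any batch system and processed in parallel.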

Describing the grid with a third-degree polynomial is in our experience the best choice. The quality of the grid 
approximation is very good, with an average $\chi^2$/n.d.f. of 1.08, 1.05, 1.11, 1.12 for Fit 1 to 4, respectively. We 
studied the performance of polynomials of different degree. At variance with Ref. Q, we observed that second-degree 
polynomials do not give a sufficiently good description of the grid. Fourth-degree polynomials perform better but do 
not lead to a significant improvement of $\chi^2$. 

Using the covariance matrix obtained in the approximation procedure, the errors of the coefficients in the 
polynomials are propagated as theoretical errors to the observables we need to fit, denoted as $\delta Y_i^{\mathrm{app}}$ in Eq. (7). 

Once the parameter dependence was described by the polynomials, the parameters were fitted to the data with 
MINUIT, using the Migrad method [7]. Approximately 150 iterations were needed by the program in order to find 
the lowest $\chi^2$ within the allowed limits set by the grid. This minimization took only a few seconds. 

If our analytical grid description is good enough, we can expect the number of iterations to be the same as if we 
used the true MC instead of the polynomial approximation. Assuming that running the generator once with the 
current statistics takes approximately 6 hours of CPU time, fitting the Monte Carlo parameters in the conventional 
iterative way is expected to take 150 × 6 = 900 hours of sequential running. Clearly, in such a case one is forced to drastically reduce the statistics, 
and the fit could be influenced by statistical fluctuations. In addition, our method allowed us to quickly repeat the 
fit by feeding MINUIT with different starting values. In this way we reduced the risk of finding a local minimum. 

IV. RESULTS 

The best-fit values of the parameters are quoted in Table I. 

         Range                                          N (GeV^-2)        B                 D               chi^2/n.d.f.
Fit 1    x_Bj < 0.005, Q^2 > 4.5 GeV^2                  0.805 ± 0.032     0.030 ± 0.006     0 (fixed)       2.0
Fit 2    x_Bj < 0.005, Q^2 > 4.5 GeV^2                  0.417 ± 0.030     0.125 ± 0.010     -9.2 ± 1.3      1.6
Fit 3    full                                           0.582 ± 0.016     0.070 ± 0.004     0 (fixed)       6.2
Fit 4    full                                           0.368 ± 0.015     0.140 ± 0.006     -8.03 ± 0.66    4.6

TABLE I: Best-fit parameters and $\chi^2$/n.d.f. for the fits described in the text. 



To have an idea of the performance of our fit, we can compare Fit 1 with earlier uGDF fits [lil EH [13]. In particular, 
we chose to compare our fit to the J2003 set 2 (JSET2) uGDF [16], which is the one with the closest conditions to 
ours. In that set, parameter $B$ is set to zero. To compare the quality of the description, we ran Cascade with a 
sample of 2.5 million events, and we found that our fit gives $\chi^2$/n.d.f. = 1.4, while the old set gives $\chi^2$/n.d.f. = 2.1. 
In other words, we found parameters that give a better description of the data, giving us confidence in our fitting 
method. In Fig. 3 we show the results of Fit 1, Fit 2, and JSET2 compared to the data. The results of Fit 3 and 4 
are shown in Fig. 4. 

In order to check if our minimization approach works, we scanned the parameter values around their best values 
to check if we can indeed identify the signs of the presence of a minimum of $\chi^2$. In Figs. 5 and 6 we show how $\chi^2$ 
changes as a function of each of the three parameters used in Fit 2 and Fit 4, while fixing the other two to their 
best-fit values. The scans were carried out using both the Monte Carlo generator directly and the grid approximation. 
For comparison, we also show the results obtained using second- and fourth-degree polynomial approximations. First 
of all, we observe that the profile has in all cases a parabolic shape and the position of the minimum is clearly visible. 
This gives us once again confidence in the reliability of the procedure. We also see that the position of the minimum 
and the shape of $\chi^2$ from the MC computation are similar to what is obtained from the grid approximation with 
third-degree polynomials. The position of the minimum is similar to what is found using the fourth-degree polynomial 
approximation, but quite different from what is found using the second-degree polynomial approximation. The value 
of the minimum $\chi^2$ is not the same for MC and grid approximation (1.4 versus 1.6 for Fit 2; 3.2 versus 4.6 for Fit 
4). This is due to the fact that the approximation errors, $\delta Y_i^{\mathrm{app}}$ in Eq. (7), are typically smaller than the MC errors, 
$\delta Y_i^{\mathrm{MC}}$ in Eq. (1), and lead to a higher $\chi^2$. This difference becomes irrelevant only if $\delta Y_i^{\mathrm{MC}}$ is negligible compared to 
the experimental errors $\delta Y_i^{\mathrm{ex}}$. 

At this point, we can briefly discuss the physical meaning of our results. First of all, we can conclude that in the 
extended $x_{Bj}$ and $Q^2$ range of Fit 3 and 4 we cannot achieve a good description of the $F_2$ data with Cascade. This 
is not surprising, since the generator starts from a purely gluonic distribution function. The description is in general 
better at lower values of $x_{Bj}$, where gluons dominate. 

Secondly, we conclude that in the restricted $x_{Bj}$ and $Q^2$ range of Fit 1 and 2, a good description of the data is 
obtained when we include parameter $D$ to give more flexibility to the functional form of the gluon distribution. 

A few considerations can also be made on the value of parameter $B$, governing the low-$x$ behavior of the gluon 
distribution. In all fits, the value is higher than in previous studies [i~3l. Ha. fl7|. This is for instance the reason for the 
different behaviors at low $x_{Bj}$ and $Q^2$ in Fig. 3. The value of $B$ turns out to be even higher in the fits with a free $D$ 
parameter. 

FIG. 3: $F_2(x)$ structure function measured by the H1 Collaboration [6] together with simulations based on the Cascade event 
generator, using the unintegrated gluon PDF obtained in Ref. [13] (dashed line), and using the parameters obtained in Fit 1 of 
the present work (solid line). The hatched areas were excluded from the fit. 

Not surprisingly, we observe that the parameters of the gluon distribution function are in general different from 
the ones obtained in global fits at similar input scales [13, . To start with, global fits include many more data sets 
than we presently considered. However, there are more fundamental differences between the physics included in the 
generators or in global fits. Therefore, to achieve the best possible description of data with Monte Carlo generators, 
the parameters of the distribution functions should be tuned independently of global fits. 

V. CONCLUSIONS 

In this work we analyzed a method to tune the parameters of Monte Carlo event generators using a set of 
experimental observables. First, the generator is run with a few different values of the parameters to tune. For each 
observable, a grid of predictions is thus obtained. The resulting grids are approximated by analytic functions of the 
parameters. Finally, the analytic functions are used in place of the generator itself to perform a $\chi^2$ fit to the data 
and obtain the best values for the parameters. The method is significantly faster than a direct use of the generator, 
as the construction of the grids typically requires fewer calls to the generator than a direct fit and all grid points 
can be computed in parallel. There is no need to rerun the generator to repeat the fit with different initial values 
of the parameters, nor if the experimental data change (for instance if the statistics increase). If data for different 
observables become available, the generator has to be run to build grids for these new observables, but the old grids 
remain valid for the old observables. 

FIG. 4: $F_2(x)$ structure function measured by the H1 Collaboration [6] together with simulations based on the Cascade event 
generator, using the parameters of Fit 3 (dashed line) and Fit 4 (solid line) of the present work. In contrast to Fig. 3, the 
whole $x_{Bj}$ and $Q^2$ range has been included in the fit. 

The main limitation of the approach is that the parameter ranges have to be fixed a priori, since the grids have to 
be built once and for all before the fitting is actually performed. It is possible to improve the choice of the parameter 
ranges with hindsight, after the first attempt. However, this approach might be time-consuming and the minimization 
can still fail if the data cannot constrain the value of one or more parameters. 

As a concrete example, we applied the method to find the best values for the parameters of the unintegrated gluon 
distribution function used in the Cascade Monte Carlo generator. To constrain the parameter values, we used the 
data on the $F_2$ structure function in inclusive deep inelastic scattering. We performed four different types of fit, 
changing the range of $x_{Bj}$ and $Q^2$ and the number of free parameters under consideration. 

Taking the second version of the fit as an illustration, we chose 150 combinations of parameter values and produced 
a grid of predictions for each one of the 122 data points. The grid was approximated by a third-order polynomial with 
a total of 35 coefficients. The best approximation was searched for using the method of Singular Value Decomposition 
to guarantee a fast and reliable search. The quality of the approximation was found to be very good, with $\chi^2$/n.d.f. = 
1.05. 

FIG. 5: $\chi^2$ profiles as a function of the parameters of the input uGDF for Fit 2. Dots: using the MC generator directly. 
Lines: using three different versions of the polynomial approximation. The vertical line and band indicate the position of the 
minimum and its error (obtained using the third-degree polynomial approximation). 

FIG. 6: Same as Fig. 5 but for Fit 4. 



Finally, we found the best values of the parameters by a second $\chi^2$ minimization, using the difference between the 
experimental measurements and the analytic approximation of the generator output to define the $\chi^2$ function. The 
minimization was done using MINUIT. 

We checked that the best-fit values of the parameters give a good description of the data, with $\chi^2$/n.d.f. = 1.6. 
By scanning the dependence of $\chi^2$ on the single parameters, we strengthened the evidence that the fit found the 
parameter values that describe the data best. 

By including more data in the fit, the method described in this work can be applied to better constrain the 
parameters of the unintegrated gluon distribution, including those describing the intrinsic $k_t$-dependence. 

Appendix A: Treatment of correlated systematic uncertainties 

A convenient method to determine the quality of a fit is to use a least square minimization. This ansatz is justified 
by the assumption that the errors are Gaussian distributed. 

A set of measurements $\{d_i\}$ will in general deviate from a set of corresponding predictions $\{t_i\}$. The deviations are 
caused by various kinds of uncertainties: for each data point there are a statistical uncertainty $\sigma_i^{\mathrm{stat}}$, an uncorrelated 
systematic uncertainty $u_i$ and, coming from $n_{\mathrm{sys}}$ sources, the correlated systematic uncertainties $\{\beta_{i1}, \beta_{i2}, \ldots, \beta_{i n_{\mathrm{sys}}}\}$. 
The measurement is then related to the prediction by: 

\[
d_i = t_i + r_i\, \alpha_i + \sum_{k=1}^{n_{\mathrm{sys}}} r'_k\, \beta_{ik}, \tag{A1}
\]

where $\sigma_i^{\mathrm{stat}}$ and $u_i$ are added in quadrature to form a unified uncorrelated error $\alpha_i = \sqrt{(\sigma_i^{\mathrm{stat}})^2 + u_i^2}$. The $r_i$, $r'_k$ 
express the individual shifts of the data points by the uncertainties and are Gaussian distributed with zero mean and 
unit variance and assumed to be independent of each other. 

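A $\chi^2$ that includes a proper treatment of correlated systematic errors can be calculated as follows (see [10] for a 
derivation): 

\[
\chi^2(\{a\}, \{r'\}) = \sum_{i=1}^{N} \frac{\left(d_i - t_i - \sum_{k=1}^{n_{\mathrm{sys}}} r'_k\, \beta_{ik}\right)^2}{\alpha_i^2} + \sum_{k=1}^{n_{\mathrm{sys}}} r'^{\,2}_k, \tag{A2}
\]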

where it can be seen that $\chi^2$ depends both on $\{a\}$ (the parameters entering the predictions $t_i$) and the random 
parameters $\{r'\}$. The latter ones can be expressed as 

\[
r'_k(\{a\}) = \sum_{k'=1}^{n_{\mathrm{sys}}} (A^{-1})_{kk'}\, B_{k'}, \tag{A3}
\]

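which leads to the $r'$-independent form 

\[
\chi^2(\{a\}) = \sum_{i=1}^{N} \frac{(d_i - t_i)^2}{\alpha_i^2} - \sum_{k,k'=1}^{n_{\mathrm{sys}}} B_k\, (A^{-1})_{kk'}\, B_{k'} \tag{A4}
\]

with 

\[
B_k = \sum_{i=1}^{N} \frac{\beta_{ik}\,(d_i - t_i)}{\alpha_i^2} \quad \text{and} \quad A_{kk'} = \delta_{kk'} + \sum_{i=1}^{N} \frac{\beta_{ik}\,\beta_{ik'}}{\alpha_i^2}. \tag{A5}
\]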

For the systematic errors, in this work we used the ansatz proposed by CTEQ, i.e., 

\[
\beta_{ik} = e_{ik}\, d_i \tag{A6}
\]

with $e_{ik}$ being the relative systematic error. 
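
The relations (A2) to (A6) can be checked numerically with a short NumPy sketch. Everything below is an assumption made for illustration (function name, toy data, a single 5% correlated source); it is not the code used for the fits in this work.

```python
import numpy as np

def chi2_correlated(data, theory, alpha, beta):
    """Chi-square of Eq. (A4), with the nuisance parameters r' already minimized.

    data, theory, alpha: arrays of length N (alpha = uncorrelated errors).
    beta: array of shape (N, n_sys) with the correlated systematic shifts.
    Returns (chi2, r_prime), where r_prime solves Eq. (A3).
    """
    weighted = (data - theory) / alpha**2
    B = beta.T @ weighted                                               # B_k of Eq. (A5)
    A = np.eye(beta.shape[1]) + beta.T @ (beta / alpha[:, None] ** 2)   # A_kk' of Eq. (A5)
    r_prime = np.linalg.solve(A, B)                                     # Eq. (A3)
    chi2 = np.sum((data - theory) ** 2 / alpha**2) - B @ r_prime        # Eq. (A4)
    return chi2, r_prime

# Toy numbers: 3 data points, 1 correlated source; theory sits 5% above data.
data = np.array([1.00, 2.00, 3.00])
theory = np.array([1.05, 2.10, 3.15])
alpha = np.array([0.05, 0.05, 0.05])
rel_sys = np.full((3, 1), 0.05)      # e_ik: 5% relative systematic error
beta = rel_sys * data[:, None]       # Eq. (A6)

chi2, r_prime = chi2_correlated(data, theory, alpha, beta)

# Cross-check against Eq. (A2) evaluated at the same r'.
shifted = data - theory - beta @ r_prime
chi2_a2 = np.sum(shifted**2 / alpha**2) + np.sum(r_prime**2)
print(chi2, chi2_a2, r_prime)
```

In this toy case a common 5% offset is largely absorbed by the correlated shift, so the resulting $\chi^2$ is much smaller than the value obtained from the uncorrelated errors alone.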